1. Introduction



Often a reference material is certified based on data 
from more than one measurement method (or from 
more than one laboratory). This situation occurs when 
no single method can provide the necessary level of 
accuracy and/or when there is no single method whose 
sources of uncertainty are well understood and quanti- 
fied. The intent of using multiple methods is to realize 
systematic effects (biases) of individual methods as 
variation across the multiple methods results. The multi- 
ple methods should be chosen to avoid common sources 
of biases, which would invalidate the use of the variation 
in estimation of the uncertainty of the systematic effects. 

If the biases are statistically independent and are
centered around zero, then the certified value and the
expanded uncertainty can be based on a t-interval [1].
Suppose $\bar{X}$ and $s$ are the sample mean and sample
standard deviation of the results of $n$ methods. The interval
$\bar{X} \pm t_{n-1,.95}\, s/\sqrt{n}$ is a 95 % confidence
interval on the population mean of the methods. Here
$t_{n-1,.95}$ is the two-sided 95 percentile point of a
t-distribution with $n - 1$ degrees of freedom.

There are two problems with the use of the t-interval.
First, it rests on the assumptions that there is a population
of methods whose biases are centered around zero
and that the chosen methods are a random sample from
the population. Second, when the number of methods is
small, the factor $t_{n-1,.95}$ can be very large. For example,
if $n = 2$, then $t_{n-1,.95} = 12.7$ and if $n = 3$, then
$t_{n-1,.95} = 4.3$. For comparison, if $n$ is large, the value
is close to 2.
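
The multipliers quoted above can be checked with a short sketch. It uses closed-form quantiles of the t-distribution that exist for 1 and 2 degrees of freedom, and the normal quantile as the large-n limit. The function name `t_two_sided_95` is illustrative and not part of the paper.

```python
import math
from statistics import NormalDist

def t_two_sided_95(dof):
    """Two-sided 95 % t multiplier for 1 or 2 degrees of freedom (closed forms)."""
    p = 0.975  # upper-tail point of a two-sided 95 % interval
    if dof == 1:
        # With 1 dof the t-distribution is Cauchy: t = tan(pi * (p - 1/2)).
        return math.tan(math.pi * (p - 0.5))
    if dof == 2:
        # With 2 dof, F(t) = 1/2 + t / (2 * sqrt(2 + t^2)); solve for t.
        c = 2.0 * p - 1.0
        return c * math.sqrt(2.0 / (1.0 - c * c))
    raise ValueError("closed form implemented only for dof 1 and 2")

t_n2 = t_two_sided_95(1)                 # n = 2 methods
t_n3 = t_two_sided_95(2)                 # n = 3 methods
z_large = NormalDist().inv_cdf(0.975)    # limit for large n

print(f"n=2: {t_n2:.1f}, n=3: {t_n3:.1f}, large n: {z_large:.2f}")
```

With two methods the multiplier is more than six times the large-sample value, which is the practical difficulty the second problem describes.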
To further explore the issues related to the certification
from multiple methods, we present an example.
Figure 1 summarizes the measurement results of two 
analytes for a reference material. The analyte Cd was 
analyzed by two methods. The mean and expanded un- 
certainty interval (coverage factor k = 2) [2,3] of each 
method are displayed on the top plot. Similarly, the 
analyte Hg was analyzed by two laboratories and the 
results are displayed in the bottom plot. In the Cd case, 
there appears to be agreement between the two methods. 
It may be reasonable to assume that there are no biases 
between the two methods. 

However, in the Hg case, there appears to be disagree- 
ment between the two laboratories. In the certification 
of this analyte, an uncertainty component for the sys- 
tematic effects of the laboratories must be considered. 
The two problems in using a t-interval for this uncertainty
component, discussed above, are present in the Hg
data. 



It is the purpose of this paper to propose and justify 
a solution to the problem of certifying reference materi- 
als based on a small number of methods in which the 
systematic effects are not completely understood. We 
call this problem the two-method problem, although the
number of methods may be three or four and laborato- 
ries may play the role of methods. Section 2 motivates a 
set of desirable criteria for a solution and reviews some 
of the existing solutions to the problem. Section 3 pre- 
sents a solution, called BOB, based on a Type B model 
[2,3] of the bias and discusses some implementation 
issues and related concerns. Section 4 gives a detailed 
worked example of BOB. Finally, Sec. 5 provides some 
concluding remarks. Appendix A covers some degrees 
of freedom issues. Appendix B presents a Bayesian jus- 
tification of BOB based on a hierarchical model. For a 
review of the context of the problem in chemical refer- 
ence materials, see Ref. [4]. 




[Figure 1. Top plot: Cd results by method, ICPMS (n = 6) and
ID-ICPMS (n = 6). Bottom plot: Hg results by laboratory,
Laboratory 1 (n = 4) and Laboratory 2 (n = 20).]

Fig. 1. Examples of measurement results. ICPMS means inductively coupled
plasma mass spectrometry and ID-ICPMS means isotope dilution inductively
coupled plasma mass spectrometry. The numbers in parentheses are the number of
measurements on which the results are based. The uncertainty intervals indicate
expanded uncertainties with coverage factors k = 2.

2. Criteria for a Solution

An important practical property for a solution to the 
two-method problem is that it is flexible enough to han- 
dle a wide variety of settings in a straightforward way. 
The variety of settings includes the following: (1) the 
existence and nonexistence of systematic effects in the 
methods; (2) the availability of two to four methods or 
laboratories and (3) the existence and nonexistence of a 
valid uncertainty evaluation for each method (i.e., 
within-method uncertainty). The alternatives in setting 
(1) are exemplified by the Cd and Hg results shown in 
Fig. 1. The Hg results are also relevant to setting (3). In 
this study, based on knowledge of the laboratories, there 
is reason to believe that the expanded uncertainty for 
Laboratory 2 is not valid. 

A property often considered desirable for a solution is 
that it should produce an expanded uncertainty interval 
that contains the measurement result of each of the 
methods. The justification for this property is that any of 
the methods may be the "correct" one since the biases 
are unknown. From a statistical point of view, this prop- 
erty is not necessary. Statistically, one requires that the 
expanded uncertainty interval is believed to include the 
unknown value of the quantity being measured (i.e., 
measurand [5]) with a stated level of confidence. Under 
the assumptions described in Sec. 1, the t-interval has
the correct level of confidence. However, as stated 
above, if the number of methods is small, the interval 
may be impractically large. 

The solution should possess certain continuity and 
scaling properties. For example, if the solution has been 
applied in the two-method case and a third method 
becomes available, then the result should not change by 
a large amount. Related to the setting (1) described 
above, the result should not change abruptly as the sys- 
tematic effect goes to zero. 

In the interest of consistency with current interna- 
tional practice, the solution should not be at odds with 
the ISO uncertainty guidelines (ISO GUM) [2,3]. 
Briefly, the ISO guidelines involve expressing the mea- 
surement result as a function of quantities whose uncer- 
tainties can be evaluated. The uncertainties of these 
quantities are expressed as standard uncertainties, which 
are propagated to derive the standard uncertainty of the 
measurement result. The notation u(X) is used for the 
standard uncertainty of the quantity X. Along with the 
standard uncertainties are associated degrees of free- 
dom, which are propagated by the Welch-Satterthwaite 
formula [2,3]. From the degrees of freedom, a coverage 
factor k is determined based on the t-distribution. The
expanded uncertainty is equal to the product of the 
standard uncertainty and the coverage factor, resulting 
in an interval with a given level of confidence. Often the
degrees of freedom are large enough simply to use a
coverage factor of k = 2.

Finally, the solution should be based on a rigorous 
statistical model. A statistical model grounds the solu- 
tion on a strong base. The formulation of such a model 
clarifies the assumptions of the solution. It also makes 
available a large literature of properties and results. Ap- 
pendix B addresses this issue. 

Before moving on to the proposed solution, we review 
currently available procedures. The t-interval approach
has already been discussed. It has most of the above 
properties. However, as mentioned above, it depends on 
assumptions that may not be valid and may produce 
impractically large intervals when there are a small 
number of methods. Any similar procedure that esti- 
mates the uncertainties associated with the systematic 
effects of the methods based solely on the observed data 
will suffer from the same problems. This constraint was 
one of the guiding principles in the derivation of the 
proposed solution. 

The Schiller-Eberhardt procedure [6] has been used 
for some time with acceptable results. It is motivated by 
the desire for the expanded uncertainty interval to con- 
tain each of the individual method means. It does not fit 
into the ISO guidelines and is not based on a rigorous 
statistical model. It has an undesirable scaling property 
in that the uncertainty can only increase as the number 
of methods increases. 

Paule-Mandel [7] was developed as an ad hoc proce- 
dure to produce a summary value of results from meth- 
ods with differing biases and precisions. Recently, it has 
been given a firmer statistical foundation [8]. However, 
there are unresolved issues related to the uncertainty of 
the estimate. Additionally, it emphasizes methods with 
high precision. High precision does not imply low bias. 

One final "solution" is to not combine the results if 
there is an indication of systematic effects that are not 
understood. 



3. Type B Model of Bias

In this section, we present a framework for a solution 
to the two-method problem. The framework is ex- 
pressed in terms of the language of the ISO guidelines. 
The model has two components. The first component is 
the estimate of the population mean of the multiple 
methods. The second component is the deviation of this 
population mean from the unknown value of the mea- 
surand, i.e., the unknown bias of the population mean. 
The possible bias is modeled via a Type B distribution 
[2,3]. (The name BOB comes from Type B On Bias). 
Type B distributions present a means of incorporating 
the available information on the problem. Because they 
are distributions, they can account for uncertainty in the
information. Distributional forms should be chosen that 
capture the information in an effective and straightfor- 
ward way. These aspects will become more apparent in 
the specifics that follow. 

The measurement model is given by

$$\gamma = \mu + \beta, \tag{1}$$

where $\gamma$ is the unknown value of the measurand, $\mu$ is the
equally weighted mean of the population means of the
methods, and $\beta$ is the possible bias of $\mu$ as an estimate
of $\gamma$. We define $\mu$ as an equally weighted mean, because
in the majority of reference material applications, it is
difficult to quantify the relative biases of the methods.
(Greek symbols are used here to emphasize that the
quantities are unobserved and unknown.) Both $\mu$ and $\beta$
require estimates and uncertainties of these estimates.
The natural estimate of $\mu$ is the sample mean of the set
of method results. Standard statistical theory gives the
uncertainty of this quantity (see example of Sec. 4). For
$\beta$, it is most often the case in the present setting to
assume that the best estimate is zero. However, it is
recognized that there is uncertainty in the estimate. If
the best estimate were not zero, then according to the
ISO guidelines the measurement result should be adjusted
by the nonzero amount.

What is required is a procedure to produce the uncertainty
estimate of $\beta$. To do this, the analyst places a
probability distribution on the value of $\beta$ that best
summarizes the available information. The top plot in Fig. 2
displays a simple and useful distribution for this purpose,
called the rectangular (also called uniform) distribution.
The distribution models the bias as (1) centered
at zero; (2) bounded between $\pm a$; and (3) equally likely
to be anywhere between $\pm a$. Under this assumption, the
standard uncertainty of the bias estimate is equal to
$a/\sqrt{3}$.

The bottom plot in Fig. 2 in conjunction with the top
plot justifies a reasonable choice of $a$. Here $X_1$, $X_2$,
and $X$ represent, respectively, the results of the two
methods and the mean of the two results. Thus, $a$ is
equal to $(X_2 - X_1)/2$. Under the measurement model of
Eq. (1), this choice of $a$ is equivalent to saying that the
unknown value of the measurand is believed to be (1)
centered at the mean of the two method results; (2)
bounded between the two method results; and (3)
equally likely to be anywhere between the two method
results.

Fig. 2. The rectangular (or uniform) distribution. [Top plot: density
bounded at $-a$ and $+a$. Bottom plot: the same density placed over the
method results and their mean.]

Fig. 3. The normal distribution. [Axis marks at $-a$ and $+a$.]

There are other useful Type B distributions that can
be placed on the bias. Another simple distribution is the
normal distribution (see Fig. 3). The normal distribution
places higher probability on values near the center of the
distribution than values far from the center. It is also
unbounded, meaning that unlike the rectangular distribution
any value is possible. These qualities are represented
by the shape of the distribution. There are several
ways of employing the normal distribution. If the analyst
believes that there is a 95 % chance that the bias is
bounded between $\pm a$, then the standard uncertainty of
the bias is $a/2$. As described above, a reasonable value
for $a$ is $(X_2 - X_1)/2$. Note that although the
normal distribution is unbounded, this use of it
results in a smaller uncertainty for the bias than
the rectangular assumption. It is important
to note that in the ISO uncertainty procedure only
the standard uncertainty matters and not the actual form
of the distribution.
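
As a rough numerical illustration of the two Type B choices, the sketch below computes the half-width $a$ and the resulting standard uncertainty of the bias under the rectangular and normal assumptions. The function names are illustrative, and the two input values are the Hg laboratory results that appear in the worked example of Sec. 4.

```python
import math

def bias_half_width(x1, x2):
    """Half-width a of the Type B distribution: half the spread of the two results."""
    return abs(x2 - x1) / 2.0

def u_rectangular(a):
    """Standard uncertainty of a rectangular distribution on [-a, +a]."""
    return a / math.sqrt(3.0)

def u_normal_95(a):
    """Standard uncertainty when +/-a is read as an approximate 95 % normal interval."""
    return a / 2.0

# Hg results from the two laboratories (mg/kg).
a = bias_half_width(0.368, 0.310)
u_rect = u_rectangular(a)
u_norm = u_normal_95(a)

print(f"a = {a:.4f}, rectangular u = {u_rect:.4f}, normal u = {u_norm:.4f}")
```

The normal reading gives the smaller standard uncertainty because $a/2 < a/\sqrt{3}$ for any positive $a$.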
3.1 Implementation Issues

The previous section described the general frame- 
work of the proposed solution to the two-method prob- 
lem. This section discusses some specific details and 
implementation issues that will arise in application. We 
emphasize that although the use of the rectangular dis- 
tribution was highlighted in the last section as a model 
for the possible bias, other distributions may be used in 
the general framework of BOB. The particular distribu- 
tion is best determined by the experimenter based on the 
knowledge of the measurement process, previous exam- 
ples, or assistance from a statistician experienced in the 
area. 

Often when there are multiple methods used, the
methods are related. The top plot of Fig. 4 illustrates
such a situation. There are four methods, but three of the
four are related to each other. In this example, three of
the methods are gas chromatography (GC) analyses and
the fourth method is neutron activation (INAA). It is
likely that the three GC analyses are more related to
each other than to the INAA analysis. The naive use of
the t-interval approach would be misleading because
these are not four independent methods. One procedure
for handling this case is to combine the three GC results
into a single GC result with an associated uncertainty.
Using the combined GC result and the INAA result, the
analyst can apply the Type B modeling described in this
paper.

Fig. 4. Multimethod examples. GC1, GC2, and GC3 represent gas
chromatography using three different columns. INAA means instrumental
neutron activation analysis. The uncertainty intervals indicate
expanded uncertainties with coverage factors k = 2. [Horizontal axis:
Method.]

The Cd results of Fig. 1 display another important 
case. In this case, there does not appear to be a between- 
method effect. The question arises when to apply the 
procedures described in this paper and when one can 
assume that there is not a between-method effect. One 
way of answering this question is to perform a t-test (or
an F-test if the number of methods is greater than 2) on
the difference between the two results [1]. The t-test, as
typically employed with an $\alpha$-level of 0.05, may favor
the conclusion that there does not exist a between-method
effect. This conclusion may result in underestimating
the uncertainty. We recommend that if the t-test
is used, the analyst use an $\alpha$-level of 0.5. Alternatively,
the use of BOB with the rectangular distribution,
as described above, may be effective. If there is not a 
between-method effect, then the results of the multiple 
methods should tend to be close to each other. In such 
a case the width of the distribution on the bias (and its 
uncertainty) will be small. Thus, there will be little 
penalty for including the effect when it is small. 

The last case we consider is displayed in the bottom 
plot of Fig. 4. Here the result of Method 1 (represented 
by the dot) has the lowest value among the four methods. 
However, the expanded uncertainty interval of Method 2 
extends below the intervals of the other three methods. 
In this case it may make more sense to define the Type 
B distribution of the bias based on the limits of the 
expanded uncertainties. In Appendix A, the presence of 
large within-method uncertainties is addressed with de- 
grees of freedom considerations. 



4. Example 

This section presents a worked example that displays 
the details of the BOB procedure using the rectangular 
distribution. The example is based on the Hg data dis- 
cussed in the body of the paper. 

Before starting the example, we review some necessary
statistical results. Suppose $W_1, W_2, \ldots, W_n$ are $n$
independent measurements. Let $\bar{W}$ and $s(W)$ denote the
sample mean and sample standard deviation, respectively.
The standard uncertainty of a sample mean, from
the random variation in the measurements, is equal to

$$s(W)/\sqrt{n}. \tag{2}$$

The associated degrees of freedom for this uncertainty
is $n - 1$. In addition to the uncertainty from the random
variation, there may exist uncertainty from systematic
effects.
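
A minimal sketch of Eq. (2) follows. The helper name `mean_with_uncertainty` and the replicate values are hypothetical and serve only to show the calculation; they are not data from the paper.

```python
import math
import statistics

def mean_with_uncertainty(values):
    """Return (sample mean, standard uncertainty s/sqrt(n), degrees of freedom n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return statistics.fmean(values), s / math.sqrt(n), n - 1

# Hypothetical replicate measurements (illustrative only).
w = [10.1, 9.9, 10.3, 9.7]
w_bar, u_w, dof_w = mean_with_uncertainty(w)

print(f"mean = {w_bar:.3f}, u = {u_w:.4f}, dof = {dof_w}")
```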
We will make multiple uses of the linear measurement
equation given by

$$Y = aW + bZ, \tag{3}$$

where $a$ and $b$ are fixed constants with no uncertainty
and $W$ and $Z$ are quantities with uncertainty. Let the
standard uncertainties of $W$ and $Z$ be $u(W)$ and $u(Z)$ and
the associated degrees of freedom be $\nu_W$ and $\nu_Z$. In all that
follows, assume that $W$ and $Z$ are independent. From
propagation of uncertainties [2,3], the standard uncertainty
of $Y$ is equal to

$$u(Y) = \sqrt{a^2 u^2(W) + b^2 u^2(Z)}. \tag{4}$$

The associated degrees of freedom, derived from the
Welch-Satterthwaite formula [2,3], is

$$\nu_Y = \frac{u^4(Y)}{a^4 u^4(W)/\nu_W + b^4 u^4(Z)/\nu_Z}. \tag{5}$$

Returning to the example, Table 1 gives the relevant
summary statistics for the results from the two laboratories.
For notation, let $\bar{X}_1$, $s_1(X)$, and $n_1$ be the summary
statistics for Laboratory 1 and likewise, $\bar{X}_2$, $s_2(X)$, and $n_2$
be the summary statistics for Laboratory 2. In order to
make certain relationships explicit, we use the notation
$X_1$ and $X_2$ to refer to the two laboratory results including
all corrections.

Table 1. Summary statistics for Hg results

Lab            1               2
$\bar{X}_i$    0.368 mg/kg     0.310 mg/kg
$s_i(X)$       0.011 mg/kg     0.0086 mg/kg
$n_i$          4               20
$u(S_i)$       0.006 mg/kg

Laboratory 1, in addition to the measurement variation,
has a possible systematic effect. The uncertainty of
the effect is quantified as a Type B source of uncertainty,
referred to as $u(S_1)$. We assume that this uncertainty
has infinite degrees of freedom. If it were possible
to identify all the systematic effects in each
laboratory's measurement process and quantify the
respective uncertainties then there would be no need to
use the BOB procedure.

Note in the following calculations, many more digits
are maintained in the intermediate steps than are shown.
This will lead to apparent discrepancies in the equations
that follow, in which only a small number of digits are
displayed.

Step 0: The Measurement Equation

The measurement equation model is given by Eq. (1),
repeated below:

$$\gamma = \mu + \beta, \tag{6}$$

where $\gamma$ is the unknown value of the concentration, $\mu$ is
the equally weighted mean of the population means of
the methods, and $\beta$ is the bias of $\mu$ as an estimate of $\gamma$.
Each quantity in the model must be estimated. (We use
Latin letters to distinguish the estimates, which are
observable, from the unobservable unknown values.
Uncertainties will be associated with the estimates, as
opposed to the unknown values.) The measurement
equation relating the estimates is

$$Y = X + B, \tag{7}$$

where $Y$ is the final measurement result, $X$ is the sample
mean of $X_1$ and $X_2$, and $B$ is equal to zero. The final
measurement result is

$$Y = X + B = \tfrac{1}{2}(X_1 + X_2) + 0 = \tfrac{1}{2}(0.368 + 0.310)\ \text{mg/kg} = 0.339\ \text{mg/kg}. \tag{8}$$

We point out here that although the number of measure- 
ments for the two methods are not the same, we weight 
the results equally because there is no reason to believe 
one result is more accurate than the other. The next 
steps are the calculation of the uncertainties of X and B 
and their combination to obtain the uncertainty of Y. 
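
The propagation rules of Eqs. (3)-(5) are reused in every later step, so it may help to see them collected in one place. The sketch below is one possible arrangement; the name `combine` and the demonstration inputs are assumptions for illustration, and only the Hg values come from Table 1.

```python
import math

def combine(a, u_w, dof_w, b, u_z, dof_z):
    """Propagate Y = a*W + b*Z for independent W and Z.

    Returns (u(Y), dof(Y)) following Eqs. (4) and (5). Pass math.inf for a
    Type B component treated as having infinite degrees of freedom.
    """
    var_w = (a * u_w) ** 2
    var_z = (b * u_z) ** 2
    u_y = math.sqrt(var_w + var_z)
    # Welch-Satterthwaite: a term with infinite degrees of freedom contributes zero.
    denom = var_w ** 2 / dof_w + var_z ** 2 / dof_z
    dof_y = math.inf if denom == 0 else u_y ** 4 / denom
    return u_y, dof_y

# Step 0 of the Hg example: equally weighted mean of the two results (mg/kg).
x1, x2 = 0.368, 0.310
y = 0.5 * (x1 + x2)

# Hypothetical check: two equal components with 5 degrees of freedom each.
u_demo, dof_demo = combine(1.0, 0.3, 5, 1.0, 0.3, 5)

print(f"Y = {y:.3f} mg/kg; demo u = {u_demo:.4f}, demo dof = {dof_demo:.1f}")
```

For two equal, independent components the effective degrees of freedom simply add, which is a convenient sanity check on the formula.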

Step 1: Within-Method Uncertainty 

For each laboratory result, calculate the standard
uncertainty. For Laboratory 2, the laboratory result is
$X_2 = \bar{X}_2$. The standard uncertainty $u(X_2)$ is given by the
result for the sample mean [see Eq. (2)]. It is equal to

$$u(X_2) = u(\bar{X}_2) = s_2(X)/\sqrt{n_2} = \frac{0.0086}{\sqrt{20}}\ \text{mg/kg} = 0.0019\ \text{mg/kg}, \tag{9}$$

and the degrees of freedom is equal to $\nu_{X_2} = 20 - 1 = 19$.

For Laboratory 1, the Type B uncertainty associated 
with the systematic effect must be included in the un- 
certainty. The systematic effect is assumed to be an 
additive effect. The resulting measurement equation is 



$$X_1 = \bar{X}_1 + S_1, \tag{10}$$

where $S_1$ is a correction that accounts for the possible
systematic effect. The uncertainty of $\bar{X}_1$ is equal to
$u(\bar{X}_1) = s_1(X)/\sqrt{n_1} = 0.011\ \text{mg/kg}/\sqrt{4} = 0.0055$ mg/kg
and has $\nu_{\bar{X}_1} = 4 - 1 = 3$ degrees of freedom. Although
$u(S_1)$ is non-zero, the best estimate of $S_1$ is zero. Using
the results of Eqs. (3)-(5), with $a = b = 1$ and $W = \bar{X}_1$
and $Z = S_1$, the standard uncertainty of the Laboratory 1
result is

$$u(X_1) = \sqrt{u^2(\bar{X}_1) + u^2(S_1)} = \sqrt{0.0055^2 + 0.006^2}\ \text{mg/kg} = 0.0081\ \text{mg/kg}, \tag{11}$$

with associated degrees of freedom

$$\nu_{X_1} = \frac{u^4(X_1)}{u^4(\bar{X}_1)/\nu_{\bar{X}_1} + u^4(S_1)/\nu_{S_1}} = \frac{0.0081^4}{0.0055^4/3 + 0.006^4/\infty} = 14.4. \tag{12}$$

Note that the term $0.006^4/\infty$ is equal to zero. Table 2
summarizes the within-laboratory uncertainties and
degrees of freedom.



Table 2. Within-method uncertainties

Lab            1               2
$u(X_i)$       0.0081 mg/kg    0.0019 mg/kg
$\nu_{X_i}$    14.4            19
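
The Step 1 numbers can be reproduced with a few lines of arithmetic. This is a sketch under the assumptions stated in the text (additive systematic effect, infinite degrees of freedom for $u(S_1)$); the variable names are illustrative.

```python
import math

# Summary statistics from Table 1 (mg/kg).
s1, n1, u_s1 = 0.011, 4, 0.006   # Laboratory 1, with Type B component u(S_1)
s2, n2 = 0.0086, 20              # Laboratory 2

# Laboratory 2: only the random variation of the mean, Eq. (9).
u_x2 = s2 / math.sqrt(n2)
dof_x2 = n2 - 1

# Laboratory 1: combine the uncertainty of the mean with u(S_1), Eqs. (11) and (12).
u_xbar1 = s1 / math.sqrt(n1)
dof_xbar1 = n1 - 1
u_x1 = math.sqrt(u_xbar1 ** 2 + u_s1 ** 2)
# u(S_1) is assumed to have infinite degrees of freedom, so its term is zero.
dof_x1 = u_x1 ** 4 / (u_xbar1 ** 4 / dof_xbar1 + u_s1 ** 4 / math.inf)

print(f"Lab 1: u = {u_x1:.4f} mg/kg, dof = {dof_x1:.1f}")
print(f"Lab 2: u = {u_x2:.4f} mg/kg, dof = {dof_x2}")
```

Adding the Type B component raises the effective degrees of freedom for Laboratory 1 from 3 to about 14, because the added term is treated as exactly known.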



Step 2: Between-Method Uncertainty 

In the BOB procedure, a Type B distribution is used 
to account for the possible bias B in the average of the 
results of the methods. In this example, we use the 
rectangular distribution bounded by the two laboratory 
results for B, as described in Sec. 3, for this purpose. 
The standard uncertainty based on this distribution is
equal to

$$u(B) = \frac{|X_1 - X_2|}{2\sqrt{3}} = \frac{|0.368 - 0.310|}{2\sqrt{3}}\ \text{mg/kg} = 0.0167\ \text{mg/kg}. \tag{13}$$

Using Eq. (20) of Appendix A, the degrees of freedom for
this quantity is

$$\nu_B = \left(\frac{1}{2}\right)\frac{(X_2 - X_1)^2}{u^2(X_1) + u^2(X_2)} = \left(\frac{1}{2}\right)\frac{(0.368 - 0.310)^2}{0.0081^2 + 0.0019^2} = 24.0. \tag{14}$$


Step 3: Combining Uncertainties 

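First, we calculate $u(X)$. Recall $X = \tfrac{1}{2}(X_1 + X_2) = \tfrac{1}{2}X_1 + \tfrac{1}{2}X_2$.
Using Eqs. (3)-(5), with $a = b = 1/2$,

$$u(X) = \sqrt{\left(\tfrac{1}{2}\right)^2 u^2(X_1) + \left(\tfrac{1}{2}\right)^2 u^2(X_2)} = \sqrt{\tfrac{1}{4}\,0.0081^2 + \tfrac{1}{4}\,0.0019^2}\ \text{mg/kg} = 0.0042\ \text{mg/kg} \tag{15}$$

and the degrees of freedom of $u(X)$ is equal to

$$\nu_X = \frac{u^4(X)}{\left(\tfrac{1}{2}\right)^4 u^4(X_1)/\nu_{X_1} + \left(\tfrac{1}{2}\right)^4 u^4(X_2)/\nu_{X_2}} = \frac{0.0042^4}{\left(\tfrac{1}{2}\right)^4 0.0081^4/14.4 + \left(\tfrac{1}{2}\right)^4 0.0019^4/19} = 16.0. \tag{16}$$

Finally, from the measurement equation, Eq. (7),

$$u(Y) = \sqrt{u^2(X) + u^2(B)} = \sqrt{0.0042^2 + 0.0167^2}\ \text{mg/kg} = 0.017\ \text{mg/kg} \tag{17}$$

and the corresponding degrees of freedom is equal to

$$\nu_Y = \frac{u^4(Y)}{u^4(X)/\nu_X + u^4(B)/\nu_B} = \frac{0.017^4}{0.0042^4/16.0 + 0.0167^4/24.0} = 27.0. \tag{18}$$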



The final summary value and its standard uncertainty 
for the results of the two-laboratory study are 0.339 
mg/kg and 0.017 mg/kg. The degrees of freedom is 27. 
The multiplier for a 95 % level of confidence interval is 
2.1, which is based on a t-multiplier with 27 degrees of
freedom (see Table B.1 of Ref. [3]). The expanded uncertainty
is equal to (2.1)(0.017) mg/kg = 0.036 mg/kg.
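
The whole worked example can be strung together as a single calculation. The sketch below is one way to do so, assuming the inputs of Table 1; the helper `combine` is an illustrative name, and the coverage factor 2.1 is taken from the text rather than computed.

```python
import math

def combine(a, u_w, dof_w, b, u_z, dof_z):
    """u and Welch-Satterthwaite dof of a*W + b*Z for independent W and Z."""
    var_w, var_z = (a * u_w) ** 2, (b * u_z) ** 2
    u_y = math.sqrt(var_w + var_z)
    return u_y, u_y ** 4 / (var_w ** 2 / dof_w + var_z ** 2 / dof_z)

x1, x2 = 0.368, 0.310  # laboratory results (mg/kg)

# Step 1: within-method uncertainties.
u_x1, dof_x1 = combine(1.0, 0.011 / math.sqrt(4), 3, 1.0, 0.006, math.inf)
u_x2, dof_x2 = 0.0086 / math.sqrt(20), 19

# Step 2: between-method (bias) uncertainty from the rectangular distribution.
u_b = abs(x1 - x2) / (2.0 * math.sqrt(3.0))
dof_b = 0.5 * (x2 - x1) ** 2 / (u_x1 ** 2 + u_x2 ** 2)

# Step 3: combine the mean of the two results with the bias term.
y = 0.5 * (x1 + x2)
u_x, dof_x = combine(0.5, u_x1, dof_x1, 0.5, u_x2, dof_x2)
u_y, dof_y = combine(1.0, u_x, dof_x, 1.0, u_b, dof_b)

k = 2.1                 # coverage factor quoted in the text for 27 degrees of freedom
expanded = k * u_y

print(f"Y = {y:.3f} mg/kg, u(Y) = {u_y:.3f} mg/kg, dof = {dof_y:.0f}, U = {expanded:.3f} mg/kg")
```

The bias term dominates the combined uncertainty here, so the final result is only weakly sensitive to the within-laboratory uncertainties.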



5. Conclusion 

It was stated in Sec. 2 that a guiding principle in the 
derivation of BOB was the constraint that solutions that 
are based solely on the observed results will produce 
intervals whose widths are comparable to the t-interval
with one degree of freedom, i.e., very large. In other 
words, two disparate methods give you effectively only 
two observations of information. BOB does not pull any 
more information out of the data. BOB overcomes the 
limitation by bringing in outside information about the 
measurement processes and quantifying this information
in terms of a Type B distribution. The particular distri- 
bution is best determined by the experimenter based on 
the knowledge of the measurement process, previous 
examples, or assistance from a statistician experienced 
in the area. In any given application, a reviewer of the 
uncertainty may disagree with the result. However, in 
BOB, the outside information appears explicitly and 
concretely and is open to evaluation. We believe this 
explicitness, which Bayesian approaches share, is a ma- 
jor strength of BOB. 

BOB also possesses many of the desirable criteria 
discussed in Sec. 2. In particular, it fits in the ISO 
framework, it is simple to implement, and it is related to 
a rigorous statistical model (see Appendix B). 



6. Appendix A. Degrees of Freedom 

The lower plot of Fig. 4 displays an example in which 
one of the within-method uncertainties is very large. In 
the basic use of the rectangular distribution presented, 
the values of the multiple method results are the input 
into the uncertainty evaluation, that is, $u(B) = |X_2 - X_1|/\sqrt{12}$.
If these method results have large uncertainties,
the uncertainty evaluation of the possible bias may not
be reliable. Degrees of freedom may be used to overcome
this problem. Degrees of freedom can be thought
of as the uncertainty in the uncertainty. Low degrees of
freedom correspond to high uncertainty in the uncertainty.
Formula G.3 of Ref. [2] provides an approximation
to the degrees of freedom of an estimated standard
uncertainty. Using this formula for $u(B) = |X_2 - X_1|/\sqrt{12}$,
the degrees of freedom is

$$\nu_B \approx \frac{1}{2}\,\frac{(X_2 - X_1)^2}{u^2(|X_2 - X_1|)}. \tag{19}$$

We suggest the use of the approximation
$u^2(|X_2 - X_1|) \approx u^2(X_1) + u^2(X_2)$. Using this approximation,
the degrees of freedom is equal to

$$\nu_B \approx \frac{1}{2}\,\frac{(X_2 - X_1)^2}{u^2(X_1) + u^2(X_2)}. \tag{20}$$



The approximation is good when $|X_2 - X_1|$ is large
relative to $u(X_1)$ and $u(X_2)$. Under this condition,
$|X_2 - X_1|$ is equal to $X_2 - X_1$ with high probability or
equal to $X_1 - X_2$ with high probability. If the condition
is not true the approximation may be poor. Also, when
the condition is not met, the use of the approximation
will result inappropriately in very small degrees of freedom.
We recommend that the degrees of freedom for the
bias be at least 3. A value of 3 is equivalent to a 42 %
uncertainty in the uncertainty of the bias term. If $X_1$ and
$X_2$ are normal, an exact formula for $u^2(|X_2 - X_1|)$ is
possible based on the folded normal distribution [9].
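
To see when the approximation holds, the sketch below compares it with the variance of a folded normal distribution, using the standard textbook expression for that variance. It assumes $X_2 - X_1$ is normal with standard uncertainty $\sqrt{u^2(X_1) + u^2(X_2)}$; the function names and the "identical results" case are illustrative and not from the paper.

```python
import math
from statistics import NormalDist

def folded_normal_variance(delta, sigma):
    """Variance of |D| when D ~ Normal(delta, sigma^2)."""
    z = delta / sigma
    mean_abs = (sigma * math.sqrt(2.0 / math.pi) * math.exp(-0.5 * z * z)
                + delta * (1.0 - 2.0 * NormalDist().cdf(-z)))
    return delta ** 2 + sigma ** 2 - mean_abs ** 2

def dof_bias(x1, x2, u_x1, u_x2, floor=3.0):
    """Eq. (20) with the recommended lower limit of 3 degrees of freedom."""
    return max(floor, 0.5 * (x2 - x1) ** 2 / (u_x1 ** 2 + u_x2 ** 2))

# Hg example: the difference is large relative to the uncertainties.
u_x1, u_x2 = 0.0081, 0.0019
sigma = math.hypot(u_x1, u_x2)           # standard uncertainty of X2 - X1
ratio_hg = folded_normal_variance(0.368 - 0.310, sigma) / sigma ** 2

# Hypothetical case: identical results, where the approximation is poor.
ratio_equal = folded_normal_variance(0.0, sigma) / sigma ** 2
dof_equal = dof_bias(0.339, 0.339, u_x1, u_x2)

print(f"exact/approx variance: Hg = {ratio_hg:.3f}, equal results = {ratio_equal:.3f}")
print(f"dof with floor for equal results = {dof_equal}")
```

For the Hg data the exact and approximate variances agree closely, while for identical results the approximation overstates the variance of $|X_2 - X_1|$ and Eq. (20) alone would give zero degrees of freedom, which is where the floor of 3 applies.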



7. Appendix B. Bayesian Model 

This appendix presents a Bayesian justification for the 
BOB procedure. It is more technical than the rest of the 
paper and uses standard notation for Bayesian statistics. 
See Ref. [10] for an introduction to Bayesian statistics 
and the notation used in this section. 

Let $[\bar{x}_1, s_1(x), n_1]$ and $[\bar{x}_2, s_2(x), n_2]$ be the summary
statistics for the two methods. Let $\mu_1$ and $\mu_2$ be the
population means of the two methods. These latter
quantities represent the sample means of a conceptually
infinite number of measurements. Let $\gamma$ be the unknown
value of ultimate interest.

One natural approach would be to build a hierarchical
model around the conditional distribution of $\mu_1, \mu_2 \mid \gamma$.
We do not follow that path here, because the resulting
uncertainty in $\gamma$ would reflect the one degree of freedom
problem we are trying to escape. Instead, we reverse
the situation and build a model around the distribution
$\gamma \mid \mu_1, \mu_2$. What this model will imply is that if one
knew $\mu_1$ and $\mu_2$, then there is no more information on $\gamma$
in the observed data. In other words, $[\bar{x}_1, s_1(x), n_1]$ and
$[\bar{x}_2, s_2(x), n_2]$ only provide information on $\mu_1$ and $\mu_2$,
which in turn provide information on $\gamma$.

It is up to the scientists to answer the question: If you
knew the results of an infinite number of measurements,
i.e., $\mu_1$ and $\mu_2$, what is the distribution that reflects the
uncertainty in $\gamma$, the value of interest? In this appendix,
we model $p(\gamma \mid \mu_1, \mu_2)$ as a uniform distribution centered
on $(\mu_1 + \mu_2)/2$ and with full width $|\mu_1 - \mu_2|$.

We use the conjugate normal model with reference
priors for the parameters as the models for the results of
the two methods. The basic result of the conjugate normal
model is that $p[\mu_1 \mid \bar{x}_1, s_1(x)]$ is the distribution of the
quantity $\bar{x}_1 + [s_1(x)/\sqrt{n_1}]\, t_{n_1-1}$, where $t_{n_1-1}$ has a
t-distribution with $n_1 - 1$ degrees of freedom. A similar result
holds for $p[\mu_2 \mid \bar{x}_2, s_2(x)]$.

With $p[\mu_1 \mid \bar{x}_1, s_1(x)]$, $p[\mu_2 \mid \bar{x}_2, s_2(x)]$, and $p(\gamma \mid \mu_1, \mu_2)$
given, the posterior distribution $p[\gamma \mid \bar{x}_1, s_1(x), \bar{x}_2, s_2(x)]$
is completely specified. Since all the components are 
basic distributions, standard statistical software can be 
used to simulate from this posterior distribution. Figure 
5 shows the resulting posterior distribution for the Hg 
data of the paper based on a simulation of 10^ values. 
The sample mean and standard deviation from the simu- 
lation are 0.339 mg/kg and 0.018 mg/kg, respectively, 
compared with 0.339 mg/kg and 0.017 mg/kg from the 
results for the BOB procedure in Sec. 4. 
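
A simulation of this kind can be sketched with the Python standard library alone. The code below is not the authors' implementation: the function names, the seed, and the number of draws (100,000) are arbitrary choices for illustration. Each draw takes $\mu_1$ and $\mu_2$ from their scaled t-distributions and then takes $\gamma$ uniformly between them.

```python
import math
import random
import statistics

def draw_t(rng, dof):
    """One draw from a t-distribution: standard normal over sqrt(chi-square / dof)."""
    chi2 = rng.gammavariate(dof / 2.0, 2.0)  # chi-square with `dof` degrees of freedom
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / dof)

def simulate_posterior(stats1, stats2, n_draws, seed=12345):
    """Draw gamma: mu_i from scaled t-distributions, then gamma uniform between them."""
    rng = random.Random(seed)
    (m1, s1, n1), (m2, s2, n2) = stats1, stats2
    draws = []
    for _ in range(n_draws):
        mu1 = m1 + s1 / math.sqrt(n1) * draw_t(rng, n1 - 1)
        mu2 = m2 + s2 / math.sqrt(n2) * draw_t(rng, n2 - 1)
        draws.append(rng.uniform(mu1, mu2))  # uniform between the two population means
    return draws

# Hg summary statistics (mean, standard deviation, n) in mg/kg.
draws = simulate_posterior((0.368, 0.011, 4), (0.310, 0.0086, 20), n_draws=100_000)
post_mean = statistics.fmean(draws)
post_sd = statistics.stdev(draws)

print(f"posterior mean = {post_mean:.3f} mg/kg, posterior sd = {post_sd:.3f} mg/kg")
```

Because Laboratory 1 has only 3 degrees of freedom, the t-distribution for $\mu_1$ has heavy tails, so the simulated standard deviation converges slowly and can vary somewhat with the seed.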
Fig. 5. Simulated posterior distribution from Hg data.



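An exact comparison of the mean and uncertainty for
the BOB procedure and the Bayesian model is possible.
In the following derivations, we suppress the dependence
on the observed quantities.

$$E(\gamma) = E[E(\gamma \mid \mu_1, \mu_2)] = E\left(\frac{\mu_1 + \mu_2}{2}\right) = \frac{\bar{x}_1 + \bar{x}_2}{2} \tag{21}$$

$$\mathrm{Var}(\gamma) = E[\mathrm{Var}(\gamma \mid \mu_1, \mu_2)] + \mathrm{Var}[E(\gamma \mid \mu_1, \mu_2)] \tag{22}$$

$$= E\left(\frac{(\mu_1 - \mu_2)^2}{12}\right) + \mathrm{Var}\left(\frac{\mu_1 + \mu_2}{2}\right) \tag{23}$$

$$= \frac{1}{12}\left[E^2(\mu_1 - \mu_2) + \mathrm{Var}(\mu_1 - \mu_2)\right] + \frac{1}{4}\mathrm{Var}(\mu_1 + \mu_2) \tag{24}$$

$$= \frac{1}{12}E^2(\mu_1 - \mu_2) + \frac{1}{3}\mathrm{Var}(\mu_1 + \mu_2) \tag{25}$$

$$= \frac{(\bar{x}_1 - \bar{x}_2)^2}{12} + \frac{1}{3}\left[\frac{n_1 - 1}{n_1 - 3}\,\frac{s_1^2(x)}{n_1} + \frac{n_2 - 1}{n_2 - 3}\,\frac{s_2^2(x)}{n_2}\right] \tag{26}$$

The mean from the BOB procedure is identical to that
of the Bayes model. The variance from BOB is

$$\frac{(\bar{x}_1 - \bar{x}_2)^2}{12} + \frac{1}{4}\left(\frac{s_1^2(x)}{n_1} + \frac{s_2^2(x)}{n_2}\right), \tag{27}$$

which differs from the Bayes model in the second term.
Future work will explore the Bayes model and generalizations
of it.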
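
The two variance expressions can be compared numerically for the Hg data. The sketch below evaluates Eq. (26) and Eq. (27) as written, so the BOB value here omits the additional Type B term $u(S_1)$ used in Sec. 4; the function names are illustrative. Eq. (26) requires more than three measurements per method, since the variance of the t-distribution is otherwise not finite.

```python
def bayes_variance(m1, s1, n1, m2, s2, n2):
    """Posterior variance of gamma under the Appendix B model, Eq. (26)."""
    t_part = ((n1 - 1) / (n1 - 3) * s1 ** 2 / n1
              + (n2 - 1) / (n2 - 3) * s2 ** 2 / n2)
    return (m1 - m2) ** 2 / 12.0 + t_part / 3.0

def bob_variance(m1, s1, n1, m2, s2, n2):
    """Variance from BOB as written in Eq. (27), without extra Type B terms."""
    return (m1 - m2) ** 2 / 12.0 + (s1 ** 2 / n1 + s2 ** 2 / n2) / 4.0

hg = (0.368, 0.011, 4, 0.310, 0.0086, 20)  # Hg summary statistics, mg/kg
sd_bayes = bayes_variance(*hg) ** 0.5
sd_bob = bob_variance(*hg) ** 0.5

print(f"Bayes sd = {sd_bayes:.4f} mg/kg, BOB sd = {sd_bob:.4f} mg/kg")
```

Both expressions share the dominant first term, so the two standard deviations differ only in the third decimal place for these data.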



About the authors: Mark S. Levenson, William F. 
Guthrie, Hung-kung Liu, Mark G. Vangel, James H. 
Yen, and Nien-fan Zang are mathematical statisticians
in the Statistical Engineering Division of the Informa- 
tion Technology Laboratory at NIST. David L. Banks, 
Keith R. Eberhardt, and Lisa M. Gill are former mem- 
bers of the Statistical Engineering Division of the Infor- 
mation Technology Laboratory at NIST. The National 
Institute of Standards and Technology is an agency of 
the Technology Administration, U.S. Department of 
Commerce. 






